Lamar County
WavePulse: Real-time Content Analytics of Radio Livestreams
Mittal, Govind, Gupta, Sarthak, Wagle, Shruti, Chopra, Chirag, DeMattee, Anthony J, Memon, Nasir, Ahamad, Mustaque, Hegde, Chinmay
Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real-time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at https://wave-pulse.io.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > New York > Kings County > New York City (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (215 more...)
- Media > Radio (1.00)
- Leisure & Entertainment (1.00)
- Government > Voting & Elections (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
Specifications: The missing link to making the development of LLM systems an engineering discipline
Stoica, Ion, Zaharia, Matei, Gonzalez, Joseph, Goldberg, Ken, Sen, Koushik, Zhang, Hao, Angelopoulos, Anastasios, Patil, Shishir G., Chen, Lingjiao, Chiang, Wei-Lin, Davis, Jared Q.
Despite the significant strides made by generative AI in just a few short years, its future progress is constrained by the challenge of building modular and robust systems. This capability has been a cornerstone of past technological revolutions, which relied on combining components to create increasingly sophisticated and reliable systems. Cars, airplanes, computers, and software consist of components (such as engines, wheels, CPUs, and libraries) that can be assembled, debugged, and replaced. A key tool for building such reliable and modular systems is specification: the precise description of the expected behavior, inputs, and outputs of each component. However, the generality of LLMs and the inherent ambiguity of natural language make defining specifications for LLM-based components (e.g., agents) both a challenging and urgent problem. In this paper, we discuss the progress the field has made so far (through advances like structured outputs, process supervision, and test-time compute) and outline several future directions for research to enable the development of modular and reliable LLM-based systems through improved specifications.
- North America > Canada (0.14)
- Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.04)
- North America > United States > Mississippi (0.04)
- (9 more...)
- Information Technology (1.00)
- Automobiles & Trucks > Manufacturer (0.93)
- Leisure & Entertainment (0.92)
- Transportation (0.88)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)
Sufficient Context: A New Lens on Retrieval Augmented Generation Systems
Joren, Hailey, Zhang, Jianyi, Ferng, Chun-Sung, Juan, Da-Cheng, Taly, Ankur, Rashtchian, Cyrus
Augmenting LLMs with context leads to improved performance across many applications. Despite much research on Retrieval Augmented Generation (RAG) systems, an open question is whether errors arise because LLMs fail to utilize the context from retrieval or the context itself is insufficient to answer the query. To shed light on this, we develop a new notion of sufficient context, along with a way to classify instances that have enough information to answer the query. We then use sufficient context to analyze several models and datasets. By stratifying errors based on context sufficiency, we find that proprietary LLMs (Gemini, GPT, Claude) excel at answering queries when the context is sufficient, but often output incorrect answers instead of abstaining when the context is not. We further categorize cases when the context is useful, and improves accuracy, even though it does not fully answer the query and the model errs without the context. Building on our findings, we explore ways to reduce hallucinations in RAG systems, including a new selective generation method that leverages sufficient context information for guided abstention. Our method improves the fraction of correct answers among times where the model responds by 2-10% for Gemini, GPT, and Gemma. Providing Large Language Models (LLMs) with additional context, such as in Retrieval Augmented Generation (RAG) systems, has led to major improvements in LLM factuality and verifiability when adapting to new domains (Lewis et al., 2020). In the case of open-domain question answering, a retrieval model provides context at inference time in the form of snippets or long-form text (Zhu et al., 2021). Then, the model synthesizes the query along with this added context to generate the answer. The ideal outcome is for the LLM to output the correct answer if the provided context contains enough information to answer the question when combined with the model's parametric knowledge. 
Otherwise, the model should abstain from answering and/or ask for more information. One core challenge in achieving this ideal outcome is building models that can use the provided context only when it helps answer the question correctly. Several works have investigated this issue by evaluating models in the presence of irrelevant information in the context (discussed in Section 2). However, "relevant information" can range from directly containing the answer to simply being topically related. Work done during an internship at Google.
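The idea of guided abstention and the metric quoted in the abstract (the fraction of correct answers among the cases where the model responds) can be illustrated with a minimal sketch. This is not the authors' method: the function names, the additive `sufficiency_bonus`, and the default threshold are hypothetical choices made only for illustration, assuming a confidence score in [0, 1] and a binary sufficient-context label are already available.

```python
from typing import List, Optional


def should_answer(confidence: float, context_sufficient: bool,
                  threshold: float = 0.5,
                  sufficiency_bonus: float = 0.2) -> bool:
    """Decide whether to answer or abstain.

    Assumption: the sufficient-context label simply adds a fixed bonus to
    the model's confidence; the abstract does not specify how the two
    signals are combined.
    """
    score = confidence + (sufficiency_bonus if context_sufficient else 0.0)
    return score >= threshold


def selective_accuracy(outcomes: List[Optional[bool]]) -> float:
    """Fraction of correct answers among the cases where the model responded.

    Each outcome is True (answered correctly), False (answered incorrectly),
    or None (abstained). Returns 0.0 if the model never responded.
    """
    answered = [o for o in outcomes if o is not None]
    if not answered:
        return 0.0
    return sum(answered) / len(answered)
```

With these toy defaults, a query with confidence 0.4 is answered only when the context is labeled sufficient. Abstentions are excluded from the denominator of `selective_accuracy`, which is why abstaining on likely errors can raise the metric.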
- North America > United States > New York (0.04)
- Europe > United Kingdom > Northern Ireland (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (11 more...)
- Transportation > Ground > Rail (0.47)
- Leisure & Entertainment > Sports (0.46)
SyROCCo: Enhancing Systematic Reviews using Machine Learning
Fang, Zheng, Arana-Catania, Miguel, van Lier, Felix-Anselm, Velarde, Juliana Outes, Bregazzi, Harry, Airoldi, Mara, Carter, Eleanor, Procter, Rob
The sheer number of research outputs published every year makes systematic reviewing increasingly time- and resource-intensive. This paper explores the use of machine learning techniques to help navigate the systematic review process. ML has previously been used to reliably 'screen' articles for review - that is, identify relevant articles based on reviewers' inclusion criteria. The application of ML techniques to subsequent stages of a review, however, such as data extraction and evidence mapping, is in its infancy. We therefore set out to develop a series of tools that would assist in the profiling and analysis of 1,952 publications on the theme of 'outcomes-based contracting'. Tools were developed for the following tasks: assign publications into 'policy area' categories; identify and extract key information for evidence mapping, such as organisations, laws, and geographical information; connect the evidence base to an existing dataset on the same topic; and identify subgroups of articles that may share thematic content. An interactive tool using these techniques and a public dataset with their outputs have been released. Our results demonstrate the utility of ML techniques to enhance evidence accessibility and analysis within the systematic review processes. These efforts show promise in potentially yielding substantial efficiencies for future systematic reviewing and for broadening their analytical scope. Our work suggests that there may be implications for the ease with which policymakers and practitioners can access evidence. While ML techniques seem poised to play a significant role in bridging the gap between research and policy by offering innovative ways of gathering, accessing, and analysing data from systematic reviews, we also highlight their current limitations and the need to exercise caution in their application, particularly given the potential for errors and biases.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Asia > India (0.04)
- South America (0.04)
- (7 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Law (1.00)
- Banking & Finance (0.93)
- Education (0.93)
- (2 more...)
The Knowledge Alignment Problem: Bridging Human and External Knowledge for Large Language Models
Zhang, Shuo, Pan, Liangming, Zhao, Junzhou, Wang, William Yang
Large language models often necessitate grounding on external knowledge to generate faithful and reliable answers. Yet even with the correct groundings in the reference, they can ignore them and rely on wrong groundings or their inherent biases to hallucinate when users, being largely unaware of the specifics of the stored information, pose questions that might not directly correlate with the retrieved groundings. In this work, we formulate this knowledge alignment problem and introduce MixAlign, a framework that interacts with both the human user and the knowledge base to obtain and integrate clarifications on how the user question relates to the stored information. MixAlign employs a language model to achieve automatic knowledge alignment and, if necessary, further enhances this alignment through human user clarifications. Experimental results highlight the crucial role of knowledge alignment in boosting model performance and mitigating hallucination, with improvements of up to 22.2% and 27.1%, respectively. We also demonstrate the effectiveness of MixAlign in improving knowledge alignment by producing high-quality, user-centered clarifications.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.05)
- North America > United States > New York (0.05)
- (9 more...)
- Leisure & Entertainment > Sports (0.98)
- Media (0.93)
Digital Twins in Wind Energy: Emerging Technologies and Industry-Informed Future Directions
Stadtman, Florian, Rasheed, Adil, Kvamsdal, Trond, Johannessen, Kjetil André, San, Omer, Kölle, Konstanze, Tande, John Olav Giæver, Barstad, Idar, Benhamou, Alexis, Brathaug, Thomas, Christiansen, Tore, Firle, Anouk-Letizia, Fjeldly, Alexander, Frøyd, Lars, Gleim, Alexander, Høiberget, Alexander, Meissner, Catherine, Nygård, Guttorm, Olsen, Jørgen, Paulshus, Håvard, Rasmussen, Tore, Rishoff, Elling, Scibilia, Francesco, Skogås, John Olav
This article presents a comprehensive overview of the digital twin technology and its capability levels, with a specific focus on its applications in the wind energy industry. It consolidates the definitions of digital twin and its capability levels on a scale from 0-5; 0-standalone, 1-descriptive, 2-diagnostic, 3-predictive, 4-prescriptive, 5-autonomous. It then, from an industrial perspective, identifies the current state of the art and research needs in the wind energy sector. The article proposes approaches to the identified challenges from the perspective of research institutes and offers a set of recommendations for diverse stakeholders to facilitate the acceptance of the technology. The contribution of this article lies in its synthesis of the current state of knowledge and its identification of future research needs and challenges from an industry perspective, ultimately providing a roadmap for future research and development in the field of digital twin and its applications in the wind energy industry.
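The 0-5 capability scale listed above maps naturally onto an ordered enumeration. The sketch below encodes only the level names given in the abstract; the `meets` helper and the assumption that levels are cumulative (a higher level implies the lower ones) are illustrative additions, not something the abstract states.

```python
from enum import IntEnum


class CapabilityLevel(IntEnum):
    """Digital twin capability levels on the 0-5 scale from the abstract."""
    STANDALONE = 0
    DESCRIPTIVE = 1
    DIAGNOSTIC = 2
    PREDICTIVE = 3
    PRESCRIPTIVE = 4
    AUTONOMOUS = 5


def meets(level: CapabilityLevel, required: CapabilityLevel) -> bool:
    """Return True if a twin at `level` reaches at least `required`.

    Assumption: levels are treated as cumulative; the abstract gives only
    their ordering on the scale.
    """
    return level >= required
```

For example, `CapabilityLevel(3)` is `CapabilityLevel.PREDICTIVE`, and `meets(CapabilityLevel.PRESCRIPTIVE, CapabilityLevel.DIAGNOSTIC)` holds under the cumulative assumption.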
- Europe > Denmark (0.14)
- Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (22 more...)
- Overview (1.00)
- Research Report > New Finding (0.45)
- Energy > Renewable > Wind (1.00)
- Government > Regional Government > North America Government > United States Government (0.67)
InGram: Inductive Knowledge Graph Embedding via Relation Graphs
Lee, Jaejun, Chung, Chanyoung, Whang, Joyce Jiyoung
Inductive knowledge graph completion has been considered as the task of predicting missing triplets between new entities that are not observed during training. While most inductive knowledge graph completion methods assume that all entities can be new, they do not allow new relations to appear at inference time. This restriction prohibits the existing methods from appropriately handling real-world knowledge graphs where new entities accompany new relations. In this paper, we propose an INductive knowledge GRAph eMbedding method, InGram, that can generate embeddings of new relations as well as new entities at inference time. Given a knowledge graph, we define a relation graph as a weighted graph consisting of relations and the affinity weights between them. Based on the relation graph and the original knowledge graph, InGram learns how to aggregate neighboring embeddings to generate relation and entity embeddings using an attention mechanism. Experimental results show that InGram outperforms 14 different state-of-the-art methods on varied inductive learning scenarios.
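A relation graph of the kind described above (nodes are relations, edges carry affinity weights) can be sketched in a few lines. The abstract does not define how affinity is computed, so this example assumes a deliberately simple stand-in: the affinity between two relations is the number of entities they share. The function name and the toy triplets are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Triplet = Tuple[str, str, str]  # (head entity, relation, tail entity)


def relation_graph(triplets: Iterable[Triplet]) -> Dict[Tuple[str, str], int]:
    """Build a weighted graph over relations.

    Assumption: affinity = number of entities that appear (as head or tail)
    with both relations. This is a simplification for illustration only.
    """
    entities_of: Dict[str, Set[str]] = defaultdict(set)
    for head, rel, tail in triplets:
        entities_of[rel].update((head, tail))

    weights: Dict[Tuple[str, str], int] = {}
    # Visit each unordered pair of relations once, in sorted order.
    for r1, r2 in combinations(sorted(entities_of), 2):
        shared = len(entities_of[r1] & entities_of[r2])
        if shared:
            weights[(r1, r2)] = shared
    return weights


toy_triplets = [
    ("a", "born_in", "x"),
    ("b", "born_in", "x"),
    ("a", "works_at", "y"),
    ("b", "works_at", "y"),
    ("y", "located_in", "x"),
]
graph = relation_graph(toy_triplets)
```

Because the graph is built only from which relations touch which entities, it can be recomputed for a knowledge graph containing relations never seen in training, which is the setting the abstract targets.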
- North America > United States > California (0.04)
- North America > United States > Texas > Lamar County > Paris (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- (3 more...)
An A.I.-Generated Film Depicts Human Loneliness, in "Thank You for Not Answering"
In the first thirty seconds of the director and artist Paul Trillo's short film "Thank You for Not Answering," a woman gazes out the window of a subway car that appears to have sunk underwater. A man appears in the window swimming toward the car, his body materializing from the darkness and swirling water. It's a frightening, claustrophobic, violent scene--one that could have taken hundreds of thousands of dollars of props and special effects to shoot, but Trillo generated it in a matter of minutes using an experimental tool kit made by an artificial-intelligence company called Runway. The figures in the film appear real, played by humans who may actually be underwater. But another glance reveals the uncanniness in their blank eyes, distended limbs, mushy features.
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
Improving Toponym Resolution with Better Candidate Generation, Transformer-based Reranking, and Two-Stage Resolution
Geocoding is the task of converting location mentions in text into structured data that encodes the geospatial semantics. We propose a new architecture for geocoding, GeoNorm. GeoNorm first uses information retrieval techniques to generate a list of candidate entries from the geospatial ontology. Then it reranks the candidate entries using a transformer-based neural network that incorporates information from the ontology such as the entry's population. This generate-and-rerank process is applied twice: first to resolve the less ambiguous countries, states, and counties, and second to resolve the remaining location mentions, using the identified countries, states, and counties as context. Our proposed toponym resolution framework achieves state-of-the-art performance on multiple datasets. Code and models are available at https://github.com/clulab/geonorm.
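The two-stage generate-and-rerank flow described above can be sketched on a toy gazetteer. This is not GeoNorm's implementation: exact name matching stands in for the information retrieval step, a sort on (parent in context, population) stands in for the transformer reranker, and the entries, identifiers, and rounded population figures are hypothetical illustration data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Entry:
    entry_id: str
    name: str
    kind: str               # "country", "state", "county", or "city"
    population: int         # rough, illustrative figures only
    parent: Optional[str]   # entry_id of the containing region, if any


# Hypothetical toy gazetteer, not a real ontology.
GAZETTEER = [
    Entry("fr", "France", "country", 67_000_000, None),
    Entry("tx", "Texas", "state", 29_000_000, "us"),
    Entry("paris-fr", "Paris", "city", 2_100_000, "fr"),
    Entry("paris-tx", "Paris", "city", 25_000, "tx"),
]
ADMIN_KINDS = {"country", "state", "county"}


def generate(mention: str, kinds: Optional[Set[str]] = None) -> List[Entry]:
    """Candidate generation: exact, case-insensitive name match (stand-in)."""
    return [e for e in GAZETTEER
            if e.name.lower() == mention.lower()
            and (kinds is None or e.kind in kinds)]


def rerank(candidates: List[Entry], context: Set[str]) -> List[Entry]:
    """Rerank: prefer entries whose parent was already resolved, then population."""
    return sorted(candidates,
                  key=lambda e: (e.parent in context, e.population),
                  reverse=True)


def resolve(mentions: List[str]) -> Dict[str, str]:
    resolved: Dict[str, Entry] = {}
    # Stage 1: resolve countries, states, and counties without context.
    for m in mentions:
        ranked = rerank(generate(m, ADMIN_KINDS), set())
        if ranked:
            resolved[m] = ranked[0]
    context = {e.entry_id for e in resolved.values()}
    # Stage 2: resolve the remaining mentions using stage-1 results as context.
    for m in mentions:
        if m not in resolved:
            ranked = rerank(generate(m), context)
            if ranked:
                resolved[m] = ranked[0]
    return {m: e.entry_id for m, e in resolved.items()}
```

In this sketch, "Paris" mentioned alone resolves to the more populous entry, while "Paris" mentioned alongside "Texas" resolves to the Texas entry because the state identified in the first stage serves as context in the second.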
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Washington > King County > Seattle (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- (18 more...)
A debate between AI experts shows a battle over the technology's future
Since the 1950s, artificial intelligence has repeatedly overpromised and underdelivered. While recent years have seen incredible leaps thanks to deep learning, AI today is still narrow: it's fragile in the face of attacks, can't generalize to adapt to changing environments, and is riddled with bias. All these challenges make the technology difficult to trust and limit its potential to benefit society. On March 26 at MIT Technology Review's annual EmTech Digital event, two prominent figures in AI took to the virtual stage to debate how the field might overcome these issues. Gary Marcus, professor emeritus at NYU and the founder and CEO of Robust.AI, is a well-known critic of deep learning.
- North America > United States > Texas > Lamar County > Paris (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)